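The large compute and memory cost of deep neural networks (DNNs) often precludes their use on resource-constrained devices. Quantizing parameters and operations to lower bit precision offers substantial memory and energy savings for neural network inference, facilitating the use of DNNs on edge computing platforms. Recent efforts to quantize DNNs have employed a range of techniques, including progressive quantization, step-size adaptation, and gradient scaling. This paper proposes a new quantization approach for mixed-precision convolutional neural networks (CNNs) targeting edge computing. Our method establishes a new Pareto frontier in model accuracy and memory footprint, demonstrating a range of quantized models with weights (wgts.) and activations (acts.) below 4.3 MB. Our main contributions are: (i) hardware-aware heterogeneous differentiable quantization with tensor-sliced learned precision, (ii) targeted gradient modification for wgts. and acts. to mitigate quantization errors, and (iii) a multi-phase learning schedule to address the learning instability that arises from updates to the learned quantizer and model parameters. We demonstrate the effectiveness of our techniques on the ImageNet dataset across a range of models, including EfficientNet-Lite0 (e.g., 4.14 MB of wgts. and acts. at 67.66% accuracy) and MobileNetV2 (e.g., 3.51 MB of wgts. [...] % accuracy).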
translated by Google Translate
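To make the memory-footprint argument of the quantization abstract above concrete, the following is a minimal sketch, not the paper's method. It assumes a plain symmetric uniform quantizer and a hypothetical three-tensor model; the names `quantize_uniform` and `footprint_mb`, the tensor sizes, and the bit widths are all illustrative assumptions.

```python
import numpy as np


def quantize_uniform(x, bits, x_max):
    """Symmetric uniform fake-quantization of x to `bits` bits over [-x_max, x_max]."""
    levels = 2 ** (bits - 1) - 1            # e.g. 7 positive levels for 4 bits
    step = x_max / levels                   # quantization step size
    q = np.clip(np.round(x / step), -levels, levels)
    return q * step                         # de-quantized values


def footprint_mb(tensor_sizes, tensor_bits):
    """Memory in MB (10^6 bytes) for tensors stored at per-tensor bit widths."""
    total_bits = sum(n * b for n, b in zip(tensor_sizes, tensor_bits))
    return total_bits / 8 / 1e6


# Hypothetical model: element counts and per-tensor precisions (assumed values).
sizes = [1_000_000, 2_000_000, 500_000]
bits = [8, 4, 2]
print(footprint_mb(sizes, bits))            # 17e6 bits -> 2.125 MB

# Values outside [-x_max, x_max] saturate; small values collapse to zero at 2 bits.
print(quantize_uniform(np.array([0.9, -0.2, 2.0]), bits=2, x_max=1.0))
```

Lowering the bit width of the largest tensors shrinks the footprint fastest, which is why a per-tensor (mixed) precision assignment can trade accuracy against memory more finely than a single global bit width.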
The term ``neuromorphic'' refers to systems that closely resemble the architecture and/or the dynamics of biological neural networks. Typical examples are novel computer chips designed to mimic the architecture of a biological brain, or sensors that take inspiration from, e.g., the visual or olfactory systems of insects and mammals to acquire information about the environment. The approach is ambitious: it promises engineered devices that reproduce the level of performance observed in biological organisms -- the main immediate advantage being the efficient use of scarce resources, which translates into low power requirements. This emphasis on low power and energy efficiency makes neuromorphic devices a natural match for space applications. Spacecraft -- especially miniaturized ones -- have strict energy constraints, as they must operate in an environment that is scarce in resources and extremely hostile. In this work we present an overview of early attempts to study a neuromorphic approach in a space context at the European Space Agency's (ESA) Advanced Concepts Team (ACT).
This paper presents the OPUS ecosystem with a focus on the development of open machine translation models and tools, and their integration into end-user applications, development platforms and professional workflows. We discuss our on-going mission of increasing language coverage and translation quality, and also describe on-going work on the development of modular translation models and speed-optimized compact solutions for real-time translation on regular desktops and small devices.
The use of deep neural networks in safety-critical systems is limited by our ability to guarantee their correct behavior. Runtime monitors are components that aim to identify unsafe predictions and discard them before they can lead to catastrophic consequences. Several recent works on runtime monitoring have focused on out-of-distribution (OOD) detection, i.e., identifying inputs that differ from the training data. In this work, we argue that OOD detection is not a well-suited framework for designing efficient runtime monitors and that it is more relevant to evaluate monitors on their ability to discard incorrect predictions. We call this setting out-of-model-scope detection and discuss its conceptual differences from OOD. We also conduct extensive experiments on popular datasets from the literature to show that studying monitors in the OOD setting can be misleading: (1) very good OOD results can give a false impression of safety, and (2) comparison under the OOD setting does not allow identifying the best monitor for detecting errors. Finally, we show that removing erroneous training samples helps to train better monitors.
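The difference between the two evaluation settings can be illustrated with a toy sketch. It assumes a hypothetical batch of four samples and a monitor that happens to be a perfect OOD detector; the function names and the data are illustrative and do not come from the paper's experiments.

```python
import numpy as np


def monitor_targets(is_ood, is_correct):
    """Return (ood_target, oms_target): which samples a monitor should reject."""
    is_ood = np.asarray(is_ood, dtype=bool)
    is_correct = np.asarray(is_correct, dtype=bool)
    ood_target = is_ood          # OOD setting: reject inputs unlike the training data
    oms_target = ~is_correct     # out-of-model-scope setting: reject wrong predictions
    return ood_target, oms_target


def rejection_accuracy(rejected, target):
    """Fraction of samples on which the monitor's decision matches the target."""
    return float(np.mean(np.asarray(rejected, dtype=bool) == target))


# Hypothetical batch: sample 1 is in-distribution but misclassified,
# sample 2 is OOD yet the model's prediction is still correct.
is_ood = [False, False, True, True]
is_correct = [True, False, True, False]
rejected = [False, False, True, True]    # a monitor that detects OOD perfectly

ood_t, oms_t = monitor_targets(is_ood, is_correct)
print(rejection_accuracy(rejected, ood_t))   # 1.0
print(rejection_accuracy(rejected, oms_t))   # 0.5
```

In this toy case the monitor scores perfectly under the OOD target while still letting an in-distribution error through, which is the kind of false impression of safety the abstract warns about.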
A recently popular approach to out-of-distribution (OOD) detection is based on a self-supervised learning technique referred to as contrastive learning. There are two main variants of contrastive learning, namely instance and class discrimination: the former targets features that discriminate between different instances, the latter features that discriminate between different classes. In this paper, we aim to understand the effectiveness and limitations of existing contrastive learning methods for OOD detection. We approach this in three ways. First, we systematically study the performance difference between the instance discrimination and supervised contrastive learning variants in different OOD detection settings. Second, we study which in-distribution (ID) classes OOD data tend to be classified into. Finally, we study the spectral decay property of the different contrastive learning approaches and examine how it correlates with OOD detection performance. In scenarios where the ID and OOD datasets are sufficiently different from one another, we see that instance discrimination, in the absence of fine-tuning, is competitive with supervised approaches in OOD detection. We see that OOD samples tend to be classified into classes whose distribution is similar to that of the entire dataset. Furthermore, we show that contrastive learning learns a feature space whose singular vectors include several high-variance directions, which can be detrimental or beneficial to OOD detection depending on the inference approach used.
With climate change predicted to increase the likelihood of landslide events, there is a growing need for rapid landslide detection technologies that help inform emergency responses. Synthetic Aperture Radar (SAR) is a remote sensing technique that can provide measurements of affected areas independent of weather or lighting conditions. The use of SAR, however, is hindered by the domain knowledge required for its pre-processing steps, and interpreting the data requires expert knowledge. We provide simplified, pre-processed, machine-learning-ready SAR datacubes for four landslide events around the globe, obtained from several Sentinel-1 satellite passes before and after a landslide-triggering event, together with segmentation maps of the landslides. From this dataset, using the Hokkaido, Japan datacube, we study the feasibility of SAR-based landslide detection with supervised deep learning (DL). Our results demonstrate that DL models can be used to detect landslides from SAR data, achieving an area under the precision-recall curve exceeding 0.7. We find that additional satellite visits enhance detection performance, but that early detection is possible when SAR data is combined with terrain information from a digital elevation model. This can be especially useful for time-critical emergency interventions. Code is made publicly available at https://github.com/iprapas/landslide-sar-unet.
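Online event-based perception techniques on board robots navigating in complex, unstructured, and dynamic environments can suffer from unpredictable changes in the incoming event rate and in the processing time, which can cause computational overflow or loss of responsiveness. This paper presents ASAP: a novel event handling framework that transmits events to the processing algorithm while keeping the system responsive and preventing overflows. ASAP is composed of two adaptive mechanisms. The first prevents event processing overflows by discarding an adaptive percentage of the incoming events. The second dynamically adapts the size of the event packages to reduce the delay between event generation and processing. ASAP has guaranteed convergence and is flexible with respect to the processing algorithm. It has been validated on board under challenging conditions.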
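The two adaptive mechanisms can be pictured with a toy controller. This is only a sketch under stated assumptions: the abstract does not give ASAP's control laws, so the update rule, the class name `AdaptiveEventFeeder`, and every default value below are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class AdaptiveEventFeeder:
    """Toy controller with two knobs: drop ratio and package size."""
    package_size: int = 1000   # events per package (hypothetical default)
    drop_ratio: float = 0.0    # fraction of incoming events to discard
    min_size: int = 50         # lower bound on the package size
    max_drop: float = 0.9      # never discard more than this fraction
    gain: float = 0.5          # hypothetical adaptation gain

    def update(self, fill_time: float, proc_time: float) -> None:
        """Adapt to the last package.

        fill_time: time it took to accumulate the package.
        proc_time: time the algorithm needed to process it (same unit).
        """
        ratio = proc_time / fill_time
        if ratio > 1.0:
            # Processing lags behind arrival: discard more events (mechanism 1).
            self.drop_ratio = min(
                self.max_drop, self.drop_ratio + self.gain * (1.0 - 1.0 / ratio)
            )
        else:
            # Processing keeps up: discard fewer events again...
            self.drop_ratio = max(0.0, self.drop_ratio - self.gain * (1.0 - ratio))
            # ...and send smaller packages so events wait less (mechanism 2).
            shrunk = int(self.package_size * max(ratio, 0.5))
            self.package_size = max(self.min_size, shrunk)


feeder = AdaptiveEventFeeder()
feeder.update(fill_time=10.0, proc_time=20.0)   # overloaded: processing is 2x too slow
print(feeder.drop_ratio, feeder.package_size)   # 0.25 1000
feeder.update(fill_time=10.0, proc_time=5.0)    # slack: processing is 2x faster
print(feeder.drop_ratio, feeder.package_size)   # 0.0 500
```

The sketch only shows how the two knobs react to measured timings; it makes no claim about the convergence guarantee stated in the abstract.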
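Event cameras can capture pixel-level illumination changes with very high temporal resolution and dynamic range. They have received increasing research interest due to their robustness to lighting conditions and motion blur. Two main approaches exist in the literature for feeding event-based processing algorithms: packaging the triggered events in event packages, and sending them one by one as single events. These approaches are limited by either processing overflow or lack of responsiveness. Processing overflow is caused by high event generation rates, when the algorithm cannot process all the events in real time. Conversely, lack of responsiveness occurs at low event generation rates, when the event packages are sent at too low a frequency. This paper presents ASAP, an adaptive scheme that manages the event stream through variable-size packages that accommodate the event package processing times. The experimental results show that ASAP is capable of feeding an asynchronous event-by-event clustering algorithm in a responsive and efficient manner while also preventing overflow.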
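Recent works have shown that tackling offline reinforcement learning (RL) with a conditional policy, by converting the RL task into a supervised learning task, can produce promising results. The Decision Transformer (DT) combines the conditional policy approach with a transformer architecture and shows competitive performance on several benchmarks. However, DT lacks stitching ability -- one of the critical abilities for offline RL, which learns the optimal policy from sub-optimal trajectories. The issue becomes significant when the offline dataset contains only sub-optimal trajectories. On the other hand, conventional RL approaches based on dynamic programming (such as Q-learning) do not have the same issue; however, they suffer from unstable learning behaviour, especially when they employ function approximation in an off-policy learning setting. In this paper, we propose the Q-learning Decision Transformer (QDT), which addresses the shortcomings of DT by leveraging the benefits of dynamic programming (Q-learning). QDT uses the dynamic programming (Q-learning) results to relabel the returns in the training data. We then train the DT with the relabelled data. Our approach efficiently exploits the benefits of both approaches and compensates for each other's shortcomings to achieve better performance. We demonstrate the issue with DT and the advantage of QDT in a simple environment. We also evaluate QDT on the more complex D4RL benchmark, showing good performance gains.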
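The relabelling step can be sketched as a backward pass over one trajectory. This is a minimal illustration under assumptions, not necessarily the paper's exact rule: it assumes undiscounted returns and that a value estimate for each visited state (for example, the maximum Q-value from Q-learning) is already available. The function name and the toy numbers are hypothetical.

```python
import numpy as np


def relabel_returns_to_go(rewards, state_values):
    """Backward pass replacing each return-to-go by max(observed, estimated).

    rewards[t]      : reward received at step t of one trajectory
    state_values[t] : value estimate for the state at step t (assumed given,
                      e.g. max over actions of a learned Q-function)
    """
    horizon = len(rewards)
    rtg = np.zeros(horizon)
    future = 0.0                          # return-to-go after the final step
    for t in reversed(range(horizon)):
        observed = rewards[t] + future    # return actually reachable along the data
        rtg[t] = max(observed, state_values[t])
        future = rtg[t]                   # propagate the relabelled value backwards
    return rtg


# Hypothetical sub-optimal trajectory: the data only ever collected a return of 1,
# but the value estimate says 3 is achievable from the first state.
rewards = [0.0, 0.0, 1.0]
values = [3.0, 0.5, 0.2]
print(relabel_returns_to_go(rewards, values))   # [3. 1. 1.]
```

Conditioning on the relabelled return at the first step asks the sequence model for behaviour better than anything observed in that single trajectory, which is the stitching ability the abstract says DT lacks on its own.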
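Explainability techniques for data-driven predictive models based on artificial intelligence and machine learning algorithms allow us to better understand the operation of such systems and help to hold them accountable. New transparency approaches are being developed at breakneck speed, enabling us to peek inside these black boxes and interpret their decisions. Many of these techniques are introduced as monolithic tools, giving the impression of one-size-fits-all, end-to-end algorithms with limited customisability. Nevertheless, such approaches are often composed of multiple interchangeable modules that need to be tuned to the problem at hand to produce meaningful explanations. This paper introduces a collection of hands-on training materials -- slides, video recordings, and Jupyter Notebooks -- that provide guidance through the process of building and evaluating bespoke modular surrogate explainers for tabular data. These resources cover the three core building blocks of this technique: interpretable representation composition, data sampling, and explanation generation.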